AAAI.2023 - Domain(s) of Application | Cool Papers

#1 Show Me the Way! Bilevel Search for Synthesizing Programmatic Strategies

Authors: David S. Aleixo ; Levi H.S. Lelis

The synthesis of programmatic strategies requires one to search in large non-differentiable spaces of computer programs. Current search algorithms use self-play approaches to guide this search. The issue with these approaches is that the guiding function often provides a weak search signal. This is because self-play functions only measure how well a program performs against other programs. Thus, while small changes to a losing program might not transform it into a winning one, such changes might represent steps in the direction of a winning program. In this paper we introduce a bilevel search algorithm that searches concurrently in the space of programs and in a space of state features. Each iteration of the search in the space of features defines a set of target features that the search in the program space attempts to achieve (i.e., features one observes while following the strategy encoded in a program). We hypothesize the combination of a self-play function and a feature-based one provides a stronger search signal for synthesis. While both functions are used to guide the search in the program space, the self-play function is used to guide the search in the feature space, to allow for the selection of target features that are more likely to lead to winning programs. We evaluated our bilevel algorithm in MicroRTS, a real-time strategy game. Our results show that the bilevel search synthesizes stronger strategies than methods that search only in the program space. Also, the strategies our method synthesizes obtained the highest winning rate in a simulated tournament with several baseline agents, including the best agents from the two latest MicroRTS competitions.

#2 Anytime User Engagement Prediction in Information Cascades for Arbitrary Observation Periods

Authors: Akshay Aravamudan ; Xi Zhang ; Georgios C. Anagnostopoulos

Predicting user engagement -- whether a user will engage in a given information cascade -- is an important problem in the context of social media, as it is useful for online marketing and misinformation mitigation, to name just a couple of major applications. Based on split population multi-variate survival processes, we develop a discriminative approach that, unlike prior works, leads to a single model for predicting whether individual users of an information network will engage in a given cascade for arbitrary forecast horizons and observation periods. Being probabilistic in nature, this model retains the interpretability of its generative counterpart and renders count prediction intervals in a disciplined manner. Our results indicate that our model is highly competitive with, if not superior to, current approaches when compared over varying observed cascade histories and forecast horizons.

#3 Principled Data-Driven Decision Support for Cyber-Forensic Investigations

Authors: Soodeh Atefi ; Sakshyam Panda ; Emmanouil Panaousis ; Aron Laszka

In the wake of a cybersecurity incident, it is crucial to promptly discover how the threat actors breached security in order to assess the impact of the incident and to develop and deploy countermeasures that can protect against further attacks. To this end, defenders can launch a cyber-forensic investigation, which discovers the techniques that the threat actors used in the incident. A fundamental challenge in such an investigation is prioritizing the investigation of particular techniques since the investigation of each technique requires time and effort, but forensic analysts cannot know which ones were actually used before investigating them. To ensure prompt discovery, it is imperative to provide decision support that can help forensic analysts with this prioritization. A recent study demonstrated that data-driven decision support, based on a dataset of prior incidents, can provide state-of-the-art prioritization. However, this data-driven approach, called DISCLOSE, is based on a heuristic that utilizes only a subset of the available information and does not approximate optimal decisions. To improve upon this heuristic, we introduce a principled approach for data-driven decision support for cyber-forensic investigations. We formulate the decision-support problem using a Markov decision process, whose states represent the states of a forensic investigation. To solve the decision problem, we propose a Monte Carlo tree search based method, which relies on a k-NN regression over prior incidents to estimate state-transition probabilities. We evaluate our proposed approach on multiple versions of the MITRE ATT&CK dataset, which is a knowledge base of adversarial techniques and tactics based on real-world cyber incidents, and demonstrate that our approach outperforms DISCLOSE in terms of techniques discovered per effort spent.
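
The abstract states that the tree search relies on a k-NN regression over prior incidents but gives no formulas. The following is only a minimal sketch of that kind of estimate, under assumptions not taken from the paper: incidents are binary vectors over techniques, similarity is the Hamming distance restricted to techniques already investigated, and the estimate is the neighbours' mean. The function name and parameters are hypothetical.

```python
from typing import Dict, List


def knn_technique_probability(
    prior_incidents: List[List[int]],
    observed: Dict[int, int],
    candidate: int,
    k: int = 3,
) -> float:
    """Estimate how likely `candidate` was used, from the k most similar prior incidents.

    prior_incidents: one 0/1 vector per past incident (1 = technique was used).
    observed: technique index -> 0/1 outcome found so far in the current investigation.
    """
    if not prior_incidents:
        raise ValueError("at least one prior incident is required")

    def distance(incident: List[int]) -> int:
        # Hamming distance, restricted to techniques already investigated (assumption).
        return sum(1 for idx, value in observed.items() if incident[idx] != value)

    neighbours = sorted(prior_incidents, key=distance)[:k]
    # k-NN regression: average the candidate's 0/1 outcome over the neighbours.
    return sum(incident[candidate] for incident in neighbours) / len(neighbours)
```

An investigation policy could call such an estimator for every technique not yet investigated and feed the results to its planner; how the paper actually combines the estimates with the tree search is not specified in the abstract.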

#4 BETA-CD: A Bayesian Meta-Learned Cognitive Diagnosis Framework for Personalized Learning

Authors: Haoyang Bi ; Enhong Chen ; Weidong He ; Han Wu ; Weihao Zhao ; Shijin Wang ; Jinze Wu

Personalized learning is a promising educational approach that aims to provide high-quality personalized services for each student with minimum demands for practice data. The key to achieving that lies in the cognitive diagnosis task, which estimates the cognitive state of the student through his/her logged data from practice quizzes. Nevertheless, in the personalized learning scenario, existing cognitive diagnosis models suffer from the inability to (1) quickly adapt to new students using a small amount of data, and (2) measure the reliability of the diagnosis result to avoid improper services that mismatch the student's actual state. In this paper, we propose a general Bayesian mETA-learned Cognitive Diagnosis framework (BETA-CD), which addresses the two challenges by prior knowledge exploitation and model uncertainty quantification, respectively. Specifically, we first introduce Bayesian hierarchical modeling to associate each student's cognitive state with a shared prior distribution encoding prior knowledge and a personal posterior distribution indicating model uncertainty. Furthermore, we formulate a meta-learning objective to automatically exploit prior knowledge from historical students, and efficiently solve it with a gradient-based variational inference method. The code will be publicly available at https://github.com/AyiStar/pyat.

#5 Set-to-Sequence Ranking-Based Concept-Aware Learning Path Recommendation

Authors: Xianyu Chen ; Jian Shen ; Wei Xia ; Jiarui Jin ; Yakun Song ; Weinan Zhang ; Weiwen Liu ; Menghui Zhu ; Ruiming Tang ; Kai Dong ; Dingyin Xia ; Yong Yu

With the development of the online education system, personalized education recommendation has played an essential role. In this paper, we focus on developing path recommendation systems that aim to generate and recommend an entire learning path to the given user in each session. Noticing that existing approaches fail to consider the correlations of concepts in the path, we propose a novel framework named Set-to-Sequence Ranking-based Concept-aware Learning Path Recommendation (SRC), which formulates the recommendation task under a set-to-sequence paradigm. Specifically, we first design a concept-aware encoder module which can capture the correlations among the input learning concepts. The outputs are then fed into a decoder module that sequentially generates a path through an attention mechanism that handles correlations between the learning and target concepts. Our recommendation policy is optimized by policy gradient. In addition, we also introduce an auxiliary module based on knowledge tracing to enhance the model’s stability by evaluating students’ learning effects on learning concepts. We conduct extensive experiments on two real-world public datasets and one industrial dataset, and the experimental results demonstrate the superiority and effectiveness of SRC. Code is now available at https://gitee.com/mindspore/models/tree/master/research/recommend/SRC.

#6 Unsupervised Deep Embedded Fusion Representation of Single-Cell Transcriptomics

Authors: Yue Cheng ; Yanchi Su ; Zhuohan Yu ; Yanchun Liang ; Ka-Chun Wong ; Xiangtao Li

Cell clustering is a critical step in analyzing single-cell RNA sequencing (scRNA-seq) data, which allows us to characterize the cellular heterogeneity of transcriptional profiling at the single-cell level. Single-cell deep embedded representation models have recently become popular since they can learn feature representation and clustering simultaneously. However, these models still suffer from a variety of significant challenges, including the massive amount of data, pervasive dropout events, and complicated noise patterns in transcriptional profiling. Here, we propose a Single-Cell Deep Embedding Fusion Representation (scDEFR) model, which develops a deep embedded fusion representation to learn a fused heterogeneous latent embedding that contains both the transcriptome gene-level information and the cell topology information. We first fuse them layer by layer to obtain compressed representations of intercellular relationships and transcriptome information. After that, a zero-inflated negative binomial (ZINB) model-based decoder is proposed to capture the global probabilistic structure of the data and reconstruct the final gene expression information and cell graph. Finally, by simultaneously integrating the clustering loss, cross-entropy loss, ZINB loss, and the cell graph reconstruction loss, scDEFR can optimize clustering performance and learn the latent representation in fused information under a joint mutual supervised strategy. We conducted extensive and comprehensive experiments on 15 single-cell RNA-seq datasets from different sequencing platforms to demonstrate the superiority of scDEFR over a variety of state-of-the-art methods.
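
For readers unfamiliar with the ZINB loss mentioned above, here is a minimal sketch of a zero-inflated negative binomial negative log-likelihood for a single count. It assumes the common mean/dispersion parameterization with a separate dropout probability; the abstract does not say which parameterization scDEFR uses, and the function names are illustrative.

```python
import math


def nb_log_pmf(x: int, mean: float, dispersion: float) -> float:
    """Log-probability of count x under a negative binomial (mean > 0, dispersion > 0)."""
    theta = dispersion
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mean))
        + x * math.log(mean / (theta + mean))
    )


def zinb_negative_log_likelihood(
    x: int, mean: float, dispersion: float, dropout: float
) -> float:
    """Negative log-likelihood of count x under a ZINB; requires 0 <= dropout < 1."""
    nb_prob = math.exp(nb_log_pmf(x, mean, dispersion))
    if x == 0:
        # A zero arises either from a dropout event or from the NB component itself.
        prob = dropout + (1.0 - dropout) * nb_prob
    else:
        # A positive count can only come from the NB component.
        prob = (1.0 - dropout) * nb_prob
    return -math.log(prob)
```

In a decoder of this kind, the mean, dispersion, and dropout values would be network outputs per gene and cell, and the loss would be summed over all entries of the expression matrix.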

#7 Constrained Submodular Optimization for Vaccine Design

Authors: Zheng Dai ; David K. Gifford

Advances in machine learning have enabled the prediction of immune system responses to prophylactic and therapeutic vaccines. However, the engineering task of designing vaccines remains a challenge. In particular, the genetic variability of the human immune system makes it difficult to design peptide vaccines that provide widespread immunity in vaccinated populations. We introduce a framework for evaluating and designing peptide vaccines that uses probabilistic machine learning models, and demonstrate its ability to produce designs for a SARS-CoV-2 vaccine that outperform previous designs. We provide a theoretical analysis of the approximability, scalability, and complexity of our framework.

#8 Flow-Based Robust Watermarking with Invertible Noise Layer for Black-Box Distortions

Authors: Han Fang ; Yupeng Qiu ; Kejiang Chen ; Jiyi Zhang ; Weiming Zhang ; Ee-Chien Chang

Deep learning-based digital watermarking frameworks have been widely studied recently. Most existing methods adopt an "encoder-noise layer-decoder"-based architecture where the embedding and extraction processes are accomplished separately by the encoder and the decoder. However, one potential drawback of such a framework is that the encoder and the decoder may not be well coupled, so the encoder may embed some redundant features into the host image, thus influencing the invisibility and robustness of the whole algorithm. To address this limitation, this paper proposes a flow-based robust watermarking framework. The basic component of such a framework is an invertible up-down-sampling neural block that can realize the embedding and extraction simultaneously. As a consequence, the encoded feature can keep high consistency with the feature that the decoder needs, which effectively avoids the embedding of redundant features. In addition, to ensure robustness against black-box distortion, an invertible noise layer (INL) is designed to simulate the distortion and serves as a noise layer in the training stage. Benefiting from its reversibility, INL is also applied as a preprocessing step before extraction to eliminate the distortion, which further improves the robustness of the algorithm. Extensive experiments demonstrate the superiority of the proposed framework in terms of visual quality and robustness. Compared with the state-of-the-art architecture, the visual quality (measured by PSNR) of the proposed framework improves by 2dB and the extraction accuracy after JPEG compression (QF=50) improves by more than 4%. Besides, strong robustness against black-box distortions is achieved, with more than 95% extraction accuracy.
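
Since the visual-quality result above is reported in PSNR, a short sketch of the standard peak signal-to-noise ratio may help in reading the 2dB figure. The code assumes flattened pixel sequences and an 8-bit peak value of 255; both are assumptions of this illustration, not details from the paper.

```python
import math
from typing import Sequence


def psnr(original: Sequence[float], distorted: Sequence[float], peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(original) != len(distorted) or len(original) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between host and watermarked pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Because PSNR is logarithmic in the mean squared error, a 2 dB gain corresponds to reducing the MSE by a factor of about 10^0.2 ≈ 1.58.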

#9 Identifying and Eliminating Majority Illusion in Social Networks

Authors: Umberto Grandi ; Lawqueen Kanesh ; Grzegorz Lisowski ; Ramanujan Sridharan ; Paolo Turrini

Majority illusion occurs in a social network when the majority of the network vertices belong to a certain type but the majority of each vertex's neighbours belong to a different type, therefore creating the wrong perception, i.e., the illusion, that the majority type is different from the actual one. From a system engineering point of view, this motivates the search for algorithms to detect and, where possible, correct this undesirable phenomenon. In this paper we initiate the computational study of majority illusion in social networks, providing NP-hardness and parametrised complexity results for its occurrence and elimination.
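
The definition in the abstract can be checked directly on a small graph. The sketch below assumes two vertex types encoded as 0/1, strict majorities, and that a vertex with no neighbours is never under illusion; these tie-breaking choices and all names are assumptions of this illustration rather than the paper's formal definitions.

```python
from typing import Dict, Hashable, Iterable, Optional, Set


def majority_colour(colour: Dict[Hashable, int]) -> Optional[int]:
    """Return the strict majority type (0 or 1), or None on a tie."""
    ones = sum(colour.values())
    if 2 * ones > len(colour):
        return 1
    if 2 * ones < len(colour):
        return 0
    return None


def vertices_under_illusion(
    adjacency: Dict[Hashable, Iterable[Hashable]], colour: Dict[Hashable, int]
) -> Set[Hashable]:
    """Vertices whose neighbourhood has a strict majority of the global minority type."""
    majority = majority_colour(colour)
    if majority is None:
        return set()
    under = set()
    for v in colour:
        neighbours = list(adjacency.get(v, ()))
        minority_count = sum(1 for u in neighbours if colour[u] != majority)
        if 2 * minority_count > len(neighbours):
            under.add(v)
    return under


def has_majority_illusion(
    adjacency: Dict[Hashable, Iterable[Hashable]], colour: Dict[Hashable, int]
) -> bool:
    """True if every vertex perceives the wrong majority, as worded in the abstract."""
    return len(colour) > 0 and len(vertices_under_illusion(adjacency, colour)) == len(colour)
```

One instance where the check succeeds: four vertices of type 1 forming a clique, plus five pendant vertices of type 0 attached to the clique (two on one clique vertex, one on each of the others). Type 0 is the global majority, yet every vertex sees mostly type 1 neighbours.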

#10 A Domain-Knowledge-Inspired Music Embedding Space and a Novel Attention Mechanism for Symbolic Music Modeling

Authors: Zixun Guo ; Jaeyong Kang ; Dorien Herremans

Following the success of the transformer architecture in the natural language domain, transformer-like architectures have been widely applied to the domain of symbolic music recently. Symbolic music and text, however, are two different modalities. Symbolic music contains multiple attributes, both absolute attributes (e.g., pitch) and relative attributes (e.g., pitch interval). These relative attributes shape human perception of musical motifs. These important relative attributes, however, are mostly ignored in existing symbolic music modelling methods, mainly because of the lack of a musically-meaningful embedding space where both the absolute and relative embeddings of the symbolic music tokens can be efficiently represented. In this paper, we propose the Fundamental Music Embedding (FME) for symbolic music based on a bias-adjusted sinusoidal encoding within which both the absolute and the relative attributes can be embedded and the fundamental musical properties (e.g., translational invariance) are explicitly preserved. Taking advantage of the proposed FME, we further propose a novel attention mechanism based on the relative index, pitch and onset embeddings (RIPO attention) such that the musical domain knowledge can be fully utilized for symbolic music modelling. Experimental results show that our proposed model, the RIPO transformer, which utilizes FME and RIPO attention, outperforms the state-of-the-art transformers (i.e., music transformer, linear transformer) in a melody completion task. Moreover, using the RIPO transformer in a downstream music generation task, we notice that the notorious degeneration phenomenon no longer exists and the music generated by the RIPO transformer outperforms the music generated by state-of-the-art transformer models in both subjective and objective evaluations. The code of the proposed method is available online: github.com/guozixunnicolas/FundamentalMusicEmbedding.

#11 MSDC: Exploiting Multi-State Power Consumption in Non-intrusive Load Monitoring Based on a Dual-CNN Model

Authors: Jialing He ; Jiamou Liu ; Zijian Zhang ; Yang Chen ; Yiwei Liu ; Bakh Khoussainov ; Liehuang Zhu

Non-intrusive load monitoring (NILM) aims to decompose an aggregated electrical usage signal into appliance-specific power consumption, and it amounts to a classical example of blind source separation tasks. Leveraging recent progress on deep learning techniques, we design a new neural NILM model, Multi-State Dual CNN (MSDC). Different from previous models, MSDC explicitly extracts information about the appliance's multiple states and state transitions, which in turn regulates the prediction of signals for appliances. More specifically, we employ a dual-CNN architecture: one CNN for outputting state distributions and the other for predicting the power of each state. A new technique is invented that utilizes conditional random fields (CRF) to capture state transitions. Experiments on two real-world datasets, REDD and UK-DALE, demonstrate that our model significantly outperforms state-of-the-art models while having good generalization capacity, achieving 6%-10% MAE gain and 33%-51% SAE gain on unseen appliances.
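
The abstract describes two outputs per appliance, a distribution over states and a power value per state, but does not say how they are merged. One plausible way to combine them, shown here purely as an assumption, is the probability-weighted (expected) power; the sketch also includes the mean absolute error used in the reported results. Function names are hypothetical.

```python
from typing import Sequence


def expected_power(state_probs: Sequence[float], state_powers: Sequence[float]) -> float:
    """Probability-weighted power over appliance states (an assumed combination rule)."""
    if len(state_probs) != len(state_powers):
        raise ValueError("one power value is needed per state")
    return sum(p * w for p, w in zip(state_probs, state_powers))


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """MAE between predicted and measured appliance power over a time series."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

For a two-state appliance that is "on" with probability 0.9 and draws 100 W when on, `expected_power([0.1, 0.9], [0.0, 100.0])` gives roughly 90 W.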

#12 Integrating Reward Maximization and Population Estimation: Sequential Decision-Making for Internal Revenue Service Audit Selection

Authors: Peter Henderson ; Ben Chugg ; Brandon Anderson ; Kristen Altenburger ; Alex Turk ; John Guyton ; Jacob Goldin ; Daniel E. Ho

We introduce a new setting, optimize-and-estimate structured bandits. Here, a policy must select a batch of arms, each characterized by its own context, that would allow it to both maximize reward and maintain an accurate (ideally unbiased) population estimate of the reward. This setting is inherent to many public and private sector applications and often requires handling delayed feedback, small data, and distribution shifts. We demonstrate its importance on real data from the United States Internal Revenue Service (IRS). The IRS performs yearly audits of the tax base. Two of its most important objectives are to identify suspected misreporting and to estimate the "tax gap" -- the global difference between the amount paid and true amount owed. Based on a unique collaboration with the IRS, we cast these two processes as a unified optimize-and-estimate structured bandit. We analyze optimize-and-estimate approaches to the IRS problem and propose a novel mechanism for unbiased population estimation that achieves rewards comparable to baseline approaches. This approach has the potential to improve audit efficacy, while maintaining policy-relevant estimates of the tax gap. This has important social consequences given that the current tax gap is estimated at nearly half a trillion dollars. We suggest that this problem setting is fertile ground for further research and we highlight its interesting challenges. The results of this and related research are currently being incorporated into the continual improvement of the IRS audit selection methods.

#13 MGTCF: Multi-Generator Tropical Cyclone Forecasting with Heterogeneous Meteorological Data

Authors: Cheng Huang ; Cong Bai ; Sixian Chan ; Jinglin Zhang ; YuQuan Wu

Accurate forecasting of tropical cyclones (TCs) plays a critical role in the prevention and defense of TC disasters. We must explore a more accurate method for TC prediction. Deep learning methods are increasingly being implemented to make TC prediction more accurate. However, most existing methods lack a generic framework for adapting heterogeneous meteorological data and do not focus on the importance of the environment. Therefore, we propose a Multi-Generator Tropical Cyclone Forecasting model (MGTCF), a generic, extensible, multi-modal TC prediction model with the key modules of Generator Chooser Network (GC-Net) and Environment Net (Env-Net). The proposed method can utilize heterogeneous meteorological data efficiently and mine environmental factors. In addition, the Multi-generator with Generator Chooser Net is proposed to tackle the drawbacks of single-generator TC prediction methods: the prediction of undesired out-of-distribution samples and the problems stemming from insufficient learning ability. To prove the effectiveness of MGTCF, we conduct extensive experiments on the China Meteorological Administration Tropical Cyclone Best Track Dataset. MGTCF obtains better performance compared with other deep learning methods and outperforms the official prediction method of the China Central Meteorological Observatory in most indexes.

#14 MDM: Molecular Diffusion Model for 3D Molecule Generation

Authors: Lei Huang ; Hengtong Zhang ; Tingyang Xu ; Ka-Chun Wong

Molecule generation, especially generating 3D molecular geometries from scratch (i.e., 3D de novo generation), has become a fundamental task in drug design. Existing diffusion-based 3D molecule generation methods can suffer from unsatisfactory performance, especially when generating large molecules. At the same time, the generated molecules lack enough diversity. This paper proposes a novel diffusion model to address those two challenges. First, interatomic relations are not included in molecules' 3D point cloud representations. Thus, it is difficult for existing generative models to capture the potential interatomic forces and abundant local constraints. To tackle this challenge, we propose to augment the potential interatomic forces and further involve dual equivariant encoders to encode interatomic forces of different strengths. Second, existing diffusion-based models essentially shift elements in geometry along the gradient of data density. Such a process lacks enough exploration in the intermediate steps of the Langevin dynamics. To address this issue, we introduce a distributional controlling variable in each diffusion/reverse step to enforce thorough explorations and further improve generation diversity. Extensive experiments on multiple benchmarks demonstrate that the proposed model significantly outperforms existing methods for both unconditional and conditional generation tasks. We also conduct case studies to help understand the physicochemical properties of the generated molecules. The code is available at https://github.com/tencent-ailab/MDM.

#15 Learning Chemical Rules of Retrosynthesis with Pre-training

Authors: Yinjie Jiang ; Ying WEI ; Fei Wu ; Zhengxing Huang ; Kun Kuang ; Zhihua Wang

Retrosynthesis aided by artificial intelligence has been a very active and burgeoning area of research, for its critical role in drug discovery as well as materials science. Three categories of solutions, i.e., template-based, template-free, and semi-template methods, constitute mainstream solutions to this problem. In this paper, we focus on template-free methods which are known to be less bothered by the template generalization issue and the atom mapping challenge. Among several remaining problems regarding template-free methods, failing to conform to chemical rules is pronounced. To address the issue, we seek a pre-training solution to empower the pre-trained model with chemical rules encoded. Concretely, we enforce the atom conservation rule via a molecule reconstruction pre-training task, and the reaction rule that dictates reaction centers via a reaction type guided contrastive pre-training task. In our empirical evaluation, the proposed pre-training solution substantially improves the single-step retrosynthesis accuracies in three downstream datasets.

#16 Online Symbolic Regression with Informative Query

Authors: Pengwei Jin ; Di Huang ; Rui Zhang ; Xing Hu ; Ziyuan Nan ; Zidong Du ; Qi Guo ; Yunji Chen

Symbolic regression, the task of extracting mathematical expressions from the observed data, plays a crucial role in scientific discovery. Despite the promising performance of existing methods, most of them conduct symbolic regression in an offline setting. That is, they treat the observed data points as given ones that are simply sampled from uniform distributions without exploring the expressive potential of data. However, for real-world scientific problems, the data used for symbolic regression are usually actively obtained by doing experiments, which is an online setting. Thus, how to obtain informative data that can facilitate the symbolic regression process is an important problem that remains challenging. In this paper, we propose QUOSR, a query-based framework for online symbolic regression that can automatically obtain informative data in an iterative manner. Specifically, at each step, QUOSR receives historical data points, generates new x, and then queries the symbolic expression to get the corresponding y, where the (x, y) serves as new data points. This process repeats until the maximum number of query steps is reached. To make the generated data points informative, we implement the framework with a neural network and train it by maximizing the mutual information between generated data points and the target expression. Through comprehensive experiments, we show that QUOSR can facilitate modern symbolic regression methods by generating informative data.
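
The query procedure described above (receive history, generate a new x, query the expression for y, repeat until the step limit) can be sketched as a simple loop. In the paper the query generator is a neural network trained with a mutual-information objective; the `farthest_point_policy` below is a hypothetical stand-in that just spreads queries over an interval, and all names and defaults are assumptions of this illustration.

```python
from typing import Callable, List, Tuple

History = List[Tuple[float, float]]


def query_loop(
    oracle: Callable[[float], float],
    propose_query: Callable[[History], float],
    max_steps: int,
) -> History:
    """Iteratively collect (x, y) points by querying the target expression."""
    history: History = []
    for _ in range(max_steps):
        x = propose_query(history)   # generate a new query from the historical points
        y = oracle(x)                # evaluate the (hidden) target expression at x
        history.append((x, y))
    return history


def farthest_point_policy(
    history: History, low: float = -1.0, high: float = 1.0, grid: int = 21
) -> float:
    """Placeholder policy: pick the grid point farthest from all earlier queries."""
    candidates = [low + i * (high - low) / (grid - 1) for i in range(grid)]
    if not history:
        return candidates[grid // 2]
    seen = [x for x, _ in history]
    return max(candidates, key=lambda c: min(abs(c - x) for x in seen))
```

The collected history would then be handed to a symbolic regression method; swapping the placeholder for a learned policy leaves the loop unchanged.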

#17 Repair Is Nearly Generation: Multilingual Program Repair with LLMs

Authors: Harshit Joshi ; José Cambronero Sanchez ; Sumit Gulwani ; Vu Le ; Gust Verbruggen ; Ivan Radiček

Most programmers make mistakes when writing code. Some of these mistakes are small and require few edits to the original program – a class of errors recently termed last mile mistakes. These errors break the flow for experienced developers and can stump novice programmers. Existing automated repair techniques targeting this class of errors are language-specific and do not easily carry over to new languages. Transferring symbolic approaches requires substantial engineering and neural approaches require data and retraining. We introduce RING, a multilingual repair engine powered by a large language model trained on code (LLMC) such as Codex. Such a multilingual engine enables a flipped model for programming assistance, one where the programmer writes code and the AI assistance suggests fixes, compared to traditional code suggestion technology. Taking inspiration from the way programmers manually fix bugs, we show that a prompt-based strategy that conceptualizes repair as localization, transformation, and candidate ranking, can successfully repair programs in multiple languages with minimal effort. We present the first results for such a multilingual repair engine by evaluating on 6 different languages and comparing performance to language-specific repair engines. We show that RING can outperform language-specific repair engines for three of these languages.

#18 Heterogeneous Graph Learning for Multi-Modal Medical Data Analysis

Authors: Sein Kim ; Namkyeong Lee ; Junseok Lee ; Dongmin Hyun ; Chanyoung Park

Routine clinical visits of a patient produce not only image data, but also non-image data containing clinical information regarding the patient, i.e., medical data is multi-modal in nature. Such heterogeneous modalities offer different and complementary perspectives on the same patient, resulting in more accurate clinical decisions when they are properly combined. However, despite its significance, how to effectively fuse the multi-modal medical data into a unified framework has received relatively little attention. In this paper, we propose an effective graph-based framework called HetMed (Heterogeneous Graph Learning for Multi-modal Medical Data Analysis) for fusing the multi-modal medical data. Specifically, we construct a multiplex network that incorporates multiple types of non-image features of patients to capture the complex relationship between patients in a systematic way, which leads to more accurate clinical decisions. Extensive experiments on various real-world datasets demonstrate the superiority and practicality of HetMed. The source code for HetMed is available at https://github.com/Sein-Kim/Multimodal-Medical.

#19 Rolling Horizon Based Temporal Decomposition for the Offline Pickup and Delivery Problem with Time Windows

Authors: Youngseo Kim ; Danushka Edirimanna ; Michael Wilbur ; Philip Pugliese ; Aron Laszka ; Abhishek Dubey ; Samitha Samaranayake

The offline pickup and delivery problem with time windows (PDPTW) is a classical combinatorial optimization problem in the transportation community, which has proven to be very challenging computationally. Due to the complexity of the problem, practical problem instances can be solved only via heuristics, which trade off solution quality for computational tractability. Among the various heuristics, a common strategy is problem decomposition, that is, the reduction of a large-scale problem into a collection of smaller sub-problems, with spatial and temporal decompositions being two natural approaches. While spatial decomposition has been successful in certain settings, effective temporal decomposition has been challenging due to the difficulty of stitching together the sub-problem solutions across the decomposition boundaries. In this work, we introduce a novel temporal decomposition scheme for solving a class of PDPTWs that have narrow time windows, for which it is able to provide both fast and high-quality solutions. We utilize techniques that have been popularized recently in the context of online dial-a-ride problems along with the general idea of rolling horizon optimization. To the best of our knowledge, this is the first attempt to solve offline PDPTWs using such an approach. To show the performance and scalability of our framework, we use the optimization of paratransit services as a motivating example. Due to the lack of benchmark solvers similar to ours (i.e., temporal decomposition with an online solver), we compare our results with an offline heuristic algorithm using Google OR-Tools. In smaller problem instances (with an average of 129 requests per instance), the baseline approach is as competitive as our framework. However, in larger problem instances (approximately 2,500 requests per instance), our framework is more scalable and can provide good solutions to problem instances of varying degrees of difficulty, while the baseline algorithm often fails to find a feasible solution within comparable compute times.

#20 GRIP: Graph Representation of Immune Repertoire Using Graph Neural Network and Transformer

Authors: Yongju Lee ; Hyunho Lee ; Kyoungseob Shin ; Sunghoon Kwon

The immune repertoire is a collection of immune receptors that has emerged as an important biomarker for both the diagnosis and therapy of cancer patients. In terms of deep learning, analyzing the immune repertoire is a challenging multiple-instance learning problem in which the immune repertoire of an individual is a bag, and each immune receptor is an instance. Although several deep learning methods for immune repertoire analysis have been introduced, they consider the immune repertoire as a set-like structure that doesn’t take account of the nature of the immune response. When the immune response occurs, mutations are introduced to the immune receptor sequence sequentially to optimize the immune response against the pathogens that enter our body. As a result, immune receptors for a specific pathogen have a lineage of evolution; thus, the immune repertoire is better represented as a graph-like structure. In this work, we present our novel method, graph representation of immune repertoire (GRIP), which analyzes the immune repertoire as a hierarchical graph structure and utilizes a collection of graph neural networks followed by graph pooling and a transformer to efficiently represent the immune repertoire as an embedding vector. We show that GRIP predicts the survival probability of cancer patients better than the set-based methods and that the graph-based structure is critical for performance. Also, GRIP provides interpretable results, which prove that GRIP adequately uses the prognosis-related immune receptors and give a further possibility of using GRIP as a novel biomarker searching tool.

#21 LagNet: Deep Lagrangian Mechanics for Plug-and-Play Molecular Representation Learning [PDF] [Copy] [Kimi]

Authors: Chunyan Li ; Junfeng Yao ; Jinsong Su ; Zhaoyang Liu ; Xiangxiang Zeng ; Chenxi Huang

Molecular representation learning is a fundamental problem in the field of drug discovery and molecular science. Incorporating molecular 3D information into the representations of molecules seems beneficial, and it is related to computational chemistry, whose basic task is predicting stable 3D structures (conformations) of molecules. Existing machine learning methods either rely on 1D and 2D molecular properties or simulate a molecular force field to use additional 3D structure information via a Hamiltonian network. The former has the disadvantage of ignoring important 3D structure features, while the latter has the disadvantage that existing Hamiltonian neural networks must satisfy the “canonical” constraint, which is difficult to satisfy in many cases. In this paper, we propose a novel plug-and-play architecture, LagNet, which simulates a molecular force field only with parameterized position coordinates and implements Lagrangian mechanics to learn molecular representations by preserving 3D conformation without obeying any additional restrictions. LagNet is designed to generate known conformations and generalize to unknown ones from molecular SMILES. Implicit positions in LagNet are learned iteratively using discrete-time Lagrangian equations. Experimental results show that LagNet learns 3D molecular structure features well and outperforms previous state-of-the-art baselines for molecular representation by a significant margin.
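
Note on "discrete-time Lagrangian equations": the abstract does not give LagNet's update rule, so the following is only a minimal sketch of the textbook discrete Euler-Lagrange recursion for a Lagrangian of the form L = (m/2)v^2 - V(q), which advances a system using positions alone. The function name `discrete_lagrangian_rollout`, the toy harmonic potential, and all parameter values are assumptions for illustration and are not taken from the paper.

```python
import math

def discrete_lagrangian_rollout(q0, q1, grad_potential, mass=1.0, h=0.01, steps=1000):
    """Position-only rollout from the discrete Euler-Lagrange equations.

    With L_d(q_k, q_{k+1}) = h * [ (m/2) * ((q_{k+1} - q_k) / h)**2 - V(q_k) ],
    stationarity of the discrete action yields
        q_{k+1} = 2 q_k - q_{k-1} - (h^2 / m) * V'(q_k).
    No momenta (and hence no canonical coordinates) are needed.
    """
    traj = [q0, q1]
    for _ in range(steps - 1):
        q_prev, q_curr = traj[-2], traj[-1]
        q_next = 2.0 * q_curr - q_prev - (h * h / mass) * grad_potential(q_curr)
        traj.append(q_next)
    return traj

# Hypothetical toy system: unit-mass harmonic oscillator, V(q) = q^2 / 2, so V'(q) = q.
h = 0.01
traj = discrete_lagrangian_rollout(1.0, math.cos(h), lambda q: q, h=h, steps=1000)

# The exact solution for these initial conditions is q(t) = cos(t).
print(abs(traj[-1] - math.cos(1000 * h)))
```

In a learned model the gradient of the potential would presumably come from a neural network rather than a closed-form function; that substitution is not shown here.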

#22 Steganography of Steganographic Networks [PDF] [Copy] [Kimi]

Authors: Guobiao Li ; Sheng Li ; Meiling Li ; Xinpeng Zhang ; Zhenxing Qian

Steganography is a technique for covert communication between two parties. With the rapid development of deep neural networks (DNN), more and more steganographic networks have been proposed recently, which are shown to be promising to achieve good performance. Unlike the traditional handcrafted steganographic tools, a steganographic network is relatively large in size. This raises concerns about how to covertly transmit the steganographic network in public channels, which is a crucial stage in the pipeline of steganography in real-world applications. To address this issue, we propose a novel scheme for steganography of steganographic networks in this paper. Unlike the existing steganographic schemes, which focus on the subtle modification of the cover data to accommodate the secrets, we propose to disguise a steganographic network (termed the secret DNN model) into a stego DNN model which performs an ordinary machine learning task (termed the stego task). During the model disguising, we select and tune a subset of filters in the secret DNN model to preserve its function on the secret task, where the remaining filters are reactivated according to a partial optimization strategy to disguise the whole secret DNN model into a stego DNN model. The secret DNN model can be recovered from the stego DNN model when needed. Various experiments have been conducted to demonstrate the advantage of our proposed method for covert communication of steganographic networks as well as general DNN models.

#23 PEN: Prediction-Explanation Network to Forecast Stock Price Movement with Better Explainability [PDF] [Copy] [Kimi]

Authors: Shuqi Li ; Weiheng Liao ; Yuhan Chen ; Rui Yan

Nowadays, explainability in stock price movement prediction is attracting increasing attention in banks, hedge funds and asset managers, primarily due to audit or regulatory reasons. Text data such as financial news and social media posts can be part of the reasons for stock price movement. To this end, we propose a novel framework of Prediction-Explanation Network (PEN) jointly modeling text streams and price streams with alignment. The key component of the PEN model is a shared representation learning module that learns which texts are possibly associated with the stock price movement by modeling the interaction between the text data and stock price data with a salient vector characterizing their correlation. In this way, the PEN model is able to predict the stock price movement by identifying and utilizing abundant messages, while on the other hand, the selected text messages also explain the stock price movement. Experiments on real-world datasets demonstrate that we are able to kill two birds with one stone: in terms of accuracy, the proposed PEN model outperforms the state-of-the-art baseline; on explainability, the PEN model is demonstrated to be far superior to the attention mechanism, capable of picking out the crucial texts with very high confidence.

#24 Decision-Making Context Interaction Network for Click-Through Rate Prediction [PDF] [Copy] [Kimi]

Authors: Xiang Li ; Shuwei Chen ; Jian Dong ; Jin Zhang ; Yongkang Wang ; Xingxing Wang ; Dong Wang

Click-through rate (CTR) prediction is crucial in recommendation and online advertising systems. Existing methods usually model user behaviors, while ignoring the informative context which influences the user to make a click decision, e.g., click pages and pre-ranking candidates that inform inferences about user interests, leading to suboptimal performance. In this paper, we propose a Decision-Making Context Interaction Network (DCIN), which deploys a carefully designed Context Interaction Unit (CIU) to learn decision-making contexts and thus benefit CTR prediction. In addition, the relationship between different decision-making context sources is explored by the proposed Adaptive Interest Aggregation Unit (AIAU) to improve CTR prediction further. In the experiments on public and industrial datasets, DCIN significantly outperforms the state-of-the-art methods. Notably, the model has obtained improvements of CTR +2.9%, CPM +2.1%, and GMV +1.5% in online A/B testing and has served the main traffic of the Meituan Waimai advertising system.

#25 Fine-Grained Position Helps Memorizing More, a Novel Music Compound Transformer Model with Feature Interaction Fusion [PDF] [Copy] [Kimi]

Authors: Zuchao Li ; Ruhan Gong ; Yineng Chen ; Kehua Su

Because multiple events occur simultaneously in music sequences, the compound Transformer was proposed to deal with the challenge of long sequences. However, there are two deficiencies in the compound Transformer. First, since the order of events is more important for music than for natural language, the information provided by the original absolute position embedding is not precise enough. Second, there is an important correlation between the tokens in the compound word, which is ignored by the current compound Transformer. Therefore, in this work, we propose an improved compound Transformer model for music understanding. Specifically, we propose an attribute embedding fusion module and a novel position encoding scheme with absolute-relative consideration. In the attribute embedding fusion module, different attributes are fused through feature permutation by using a multi-head self-attention mechanism in order to capture rich interactions between attributes. In the novel position encoding scheme, we propose RoAR position encoding, which realizes rotational absolute position encoding, relative position encoding, and absolute-relative position interactive encoding, providing clear and rich orders for musical events. An empirical study on four typical music understanding tasks shows that our attribute fusion approach and RoAR position encoding bring large performance gains. In addition, we further investigate the impact of masked language modeling and causal language modeling pre-training on music understanding.
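
Note on rotational position encoding: the abstract does not define RoAR, so the sketch below shows only the widely used rotary position embedding idea, in which each vector is rotated by an angle proportional to its absolute position and the resulting attention score depends only on the relative offset. The function names, the base of 10000, and the example vectors are assumptions for illustration; RoAR's absolute-relative interaction terms are not modeled.

```python
import math

def rotate(vec, pos, base=10000.0):
    """Rotate consecutive coordinate pairs of vec by position-dependent angles."""
    d = len(vec)
    assert d % 2 == 0, "dimension must be even so coordinates pair up"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)  # one frequency per coordinate pair
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors.
q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]

# Same relative offset (2) at different absolute positions gives the same score.
score_a = dot(rotate(q, 3), rotate(k, 5))
score_b = dot(rotate(q, 10), rotate(k, 12))
print(abs(score_a - score_b))
```

The rotation is applied using absolute positions, yet the score is a function of the offset alone, which is one way an encoding can carry both absolute and relative order information.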